A new biophysical metric for interrogating the information content in human genome sequence variation: Proof of concept.
Abstract
The 21st century emergence of genomic medicine is shifting the paradigm in biomedical science from the population phenotype to the individual genotype. In characterizing the biology of disease and health disparities in population genetics, human populations are often defined by the most common alleles in the group. This definition poses difficulties when categorizing individuals in the population who do not have the most common allele(s). Various epidemiological studies have shown an association between common genomic variation, such as single nucleotide polymorphisms (SNPs), and common diseases. We hypothesize that information encoded in the structure of SNP haploblock variation in the human leukocyte antigen-disease related (HLA-DR) region of the genome illumines molecular pathways and cellular mechanisms involved in the regulation of host adaptation to the environment. In this paper we describe the development and application of the normalized information content (NIC) as a novel metric based on SNP haploblock variation. The NIC facilitates translation of biochemical DNA sequence variation into a biophysical quantity derived from Boltzmann's canonical ensemble in statistical physics and used widely in information theory. Our normalization of this information metric allows for comparisons of unlike, or even unrelated, regions of the genome. We report here NIC values calculated for HLA-DR SNP haploblocks constructed by Haploview, a product of the International Haplotype Map Project. These haploblocks were scanned for potential regulatory elements using ConSite and miRBase, publicly available bioinformatics tools. We found that all of the haploblocks with statistically low NIC values contained putative transcription factor binding sites and microRNA motifs, suggesting correlation with genomic regulation. 
Thus, we were able to relate a mathematical measure of information content in HLA-DR SNP haploblocks to biologically relevant functional knowledge embedded in the structure of DNA sequence variation. We submit that NIC may be useful in analyzing the regulation of molecular pathways involved in host adaptation to environmental pathogens and in decoding the functional significance of common variation in the human genome.
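The abstract does not state the NIC formula, so the following is only a minimal sketch of the general idea of an entropy-based, normalized information measure over haplotypes. It assumes biallelic SNPs, uses Shannon entropy of the observed haplotype frequencies, and normalizes by the maximum entropy possible for a block of that length. The function name `normalized_information` and this particular normalization are illustrative assumptions, not the authors' definition.

```python
import math
from collections import Counter


def normalized_information(haplotypes):
    """Hypothetical normalized information score for one haploblock.

    haplotypes: list of equal-length strings, one per observed chromosome,
    each character being the allele at one SNP site.
    Returns 1 - H / H_max, where H is the Shannon entropy (in bits) of the
    haplotype frequencies. This normalization is an assumption made for
    illustration; the source does not give the NIC formula.
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")
    n_sites = len(haplotypes[0])
    if n_sites == 0 or any(len(h) != n_sites for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")

    counts = Counter(haplotypes)
    total = sum(counts.values())

    # Shannon entropy of the observed haplotype distribution, in bits.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    # Assumption: biallelic SNPs, so a block of n sites has at most 2**n
    # haplotypes and a maximum entropy of n bits.
    max_entropy = float(n_sites)

    return 1.0 - entropy / max_entropy


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    block = ["AC", "AC", "TG", "TG"]
    print(normalized_information(block))
```

Under this assumed normalization, the score is a dimensionless number between 0 and 1, which is what makes blocks of different lengths comparable.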
Related articles
Information Dynamics of Whole Genome Adaptation
The human genome is a complex, dynamic information system that encodes principles of life and living systems. These principles are incorporated in the structure of human genome sequence variation and are foundational for the continuity of life and human survival. Using first principles of thermodynamics and statistical physics, we have developed analogous "genodynamic tools" for population geno...
Single Nucleotide Polymorphisms and Association Studies: A Few Critical Points
Uncovering DNA sequence variations that correlate with phenotypic changes, e.g., diseases, is the aim of sequence variation studies. A common type of sequence variation is the single nucleotide polymorphism (SNP, pronounced "snip"). SNPs are third-generation molecular markers. A SNP represents a DNA sequence variant of a single base pair with the minor allele occurring in more than 1% of a given popula...
Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species
Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to the D genome, which is an important wild bio-source for cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information is a versatile tool for species identification and phylogenetic implications in plants. Different ch...
I-38: Chromosome Instability in The Cleavage Stage Embryo
Recently, we demonstrated chromosome instability (CIN) in human cleavage stage embryogenesis following in vitro fertilization (IVF). CIN does not necessarily undermine normal human development (i.e., when the remaining normal diploid blastomeres develop the embryo proper); however, it can spark a spectrum of conditions, including loss of conception, genetic disease and genetic variation development. To s...
Some notes on ``Common fixed point of two $R$-weakly commuting mappings in $b$-metric spaces"
Very recently, Kumam et al. [P. Kumam, W. Sintunavarat, S. Sedghi, and N. Shobkolaei. Common Fixed Point of Two $R$-Weakly Commuting Mappings in $b$-Metric Spaces. Journal of Function Spaces, Volume 2015, Article ID 350840, 5 pages] obtained some interesting common fixed point results for two mappings satisfying a generalized contractive condition in $b$-metric spaces without the assumption of the...
Journal title: Journal of Computational Biology and Bioinformatics Research
Volume 4, Issue 2
Publication date: 2012